Machine learning dislocation density correlations and solute effects in Mg-based alloys

Magnesium alloys, among the lightest structural materials, are excellent candidates for lightweight applications. However, their industrial use remains limited due to relatively low strength and ductility. Solid solution alloying has been shown to enhance Mg ductility and formability at relatively low solute concentrations, and Zn solutes are particularly cost-effective and common. However, the intrinsic mechanisms by which the addition of solutes leads to ductility improvement remain controversial. Here, by using high-throughput analysis of intragranular characteristics through data science approaches, we study the evolution of dislocation density in polycrystalline Mg and Mg–Zn alloys. We apply machine learning techniques to compare electron back-scatter diffraction (EBSD) images of the samples before/after alloying and before/after deformation, to extract the strain history of individual grains, and to predict the dislocation density level after alloying and after deformation. Our results are promising given that moderate predictions (coefficient of determination R² ranging from 0.25 to 0.32) are already achieved with a relatively small dataset (∼5000 sub-millimeter grains).

Supplementary information to "Machine learning dislocation density correlations and solute effects in Mg-based alloys" by Salmenjoki et al.

Supplementary Note 1: Data acquisition
To collect the corresponding grain data before and after loading, we first had to find the coordinates of the same position in both images. We started by computing a map of the mean squared difference (MSD) between the pixel orientations of 50 × 50 pixel sub-images of the before- and after-loading data. We then condensed the MSD map into a coordinate map by first retrieving the coordinates of the MSD minimum for every sub-image and then median-filtering the coordinate map to reduce noise and obtain a smooth mapping between the two images (see the sketch at the end of this note). The correspondence of the grains is naturally not perfect, as the after image has been deformed with respect to the before image, but this procedure yields adequate matches, as illustrated by the examples in Supplementary Fig. 1. In this way, we were able to collect a dataset containing the features of each initial grain together with the ρGND values in the pixels of the post-deformation image corresponding to that grain. However, as the figure shows, some pixels go missing due to newly forming grain boundaries, inaccuracy in finding the correct pixels, or the changing shape of the grains. To counter the first of these, i.e. new grain boundaries, we took twinning into account by merging grains whose boundaries match the twinning misorientation in the after image (following the procedure documented in the MTEX manual [1]). This reduced the number of missing pixels in some cases but, as noted in the original publication presenting the data, twinning is not a dominant deformation mechanism in the studied samples [2]. One example of a twinning grain is illustrated in Supplementary Fig. 2.
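The tile-matching step can be summarized with the following minimal sketch (Python with numpy/scipy). It simplifies the orientation data to one scalar channel per pixel, and the brute-force search window `search` as well as the function name `match_coordinates` are our own illustrative choices, not part of any library:

```python
import numpy as np
from scipy.ndimage import median_filter

def match_coordinates(before, after, tile=50, search=20):
    """For each 50 x 50 tile of `before`, find the offset in `after` that
    minimizes the mean squared difference (MSD) of pixel orientations,
    then median-filter the offset field into a smooth coordinate map."""
    ny, nx = before.shape[0] // tile, before.shape[1] // tile
    shifts = np.zeros((ny, nx, 2))
    for i in range(ny):
        for j in range(nx):
            sub = before[i*tile:(i+1)*tile, j*tile:(j+1)*tile]
            best, best_msd = (0, 0), np.inf
            # brute-force scan of candidate offsets around the original position
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = i*tile + dy, j*tile + dx
                    if (y0 < 0 or x0 < 0 or
                            y0 + tile > after.shape[0] or x0 + tile > after.shape[1]):
                        continue
                    cand = after[y0:y0+tile, x0:x0+tile]
                    msd = np.mean((sub - cand) ** 2)
                    if msd < best_msd:
                        best_msd, best = msd, (dy, dx)
            shifts[i, j] = best
    # median filtering suppresses outlier matches and smooths the coordinate map
    shifts[..., 0] = median_filter(shifts[..., 0], size=3)
    shifts[..., 1] = median_filter(shifts[..., 1], size=3)
    return shifts
```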
Supplementary Note 2: Grain features used
The following features of the initial grains were used as inputs:
- size s
- circumference
- max width
- max height
- ρGND/s
- dρGND,1(4 µm)
- dρGND,2(4 µm)
- dρGND,3(4 µm)
- orientation parameters φ1, Φ, φ2
- Grain Average Misorientation (GAM)
- sum of misorientations between nbrs θn
- number of neighbors n_nbr
- average misorientation θn/n_nbr
- Grain Orientation Spread (GOS)

Supplementary Note 3: SVM training details
The SVM was implemented with the scikit-learn library [3]. Our SVM used the radial basis function kernel. The "tube" hyperparameter ε was set to 0.25, and the penalty parameter C ≈ 1.5 was deduced by finding the optimal value according to the loss on validation grains, as illustrated in Supplementary Fig. 3 (a minimal code sketch follows the figure caption).
Supplementary Figure 3. MSE of the SVM prediction for the training and validation sets as a function of the hyperparameter C.
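As a minimal sketch of this setup, assuming the grain features have been collected into an array `X` and the targets into `y` (names illustrative; the exact feature preprocessing is not fixed here), the SVR training and the validation-based scan over C could look as follows:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# X: per-grain feature matrix, y: regression target for each grain
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

# RBF kernel with an epsilon-"tube" of 0.25; scan C and keep the value
# minimizing the validation MSE (C ~ 1.5 in our case)
best_C, best_mse = None, np.inf
for C in np.logspace(-2, 2, 30):
    svr = SVR(kernel="rbf", epsilon=0.25, C=C).fit(X_train, y_train)
    mse = mean_squared_error(y_val, svr.predict(X_val))
    if mse < best_mse:
        best_C, best_mse = C, mse
```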
Supplementary Note 4: Missing pixels in the EBSD image and impact on the SVM predictions
As discussed in the main text and in Supplementary Note 1 above, finding a one-to-one correspondence of grains in the images before and after loading poses several challenges. One is the increasing number of noisy pixels in the after image. This arises at least partially from new, forming grain boundaries [4], but drastic local orientation differences also lead to missing values of ρGND in the after image. Due to the missing pixels, the target value onto which the SVM mapping is done contains noise, and this affects the prediction success of the SVM. To elaborate, Supplementary Fig. 4 shows the absolute error versus the fraction of missing pixels in the image after loading (0 meaning all pixels were found, 1 meaning all pixels are missing) for single-grain predictions, along with the average curve. As expected, the error of the SVM output increases with the number of missing pixels, and does so drastically when the fraction approaches 1 (a sketch of this analysis is given below the figure caption).
Supplementary Figure 4. Absolute error between the true and predicted values versus the fraction of missing pixels in the grain after loading for single grains (points) and the mean (line).
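A minimal sketch of how the average curve can be computed, assuming the per-grain arrays `abs_error` and `missing_frac` have already been extracted (names illustrative):

```python
import numpy as np

# abs_error[i]: |true - predicted| for grain i
# missing_frac[i]: fraction of the grain's pixels missing after loading (0..1)
bins = np.linspace(0.0, 1.0, 21)
idx = np.clip(np.digitize(missing_frac, bins), 1, len(bins) - 1)

# mean absolute error within each missing-fraction bin -> the average curve
mean_curve = np.array([
    abs_error[idx == k].mean() if np.any(idx == k) else np.nan
    for k in range(1, len(bins))
])
bin_centers = 0.5 * (bins[:-1] + bins[1:])
```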
Supplementary Note 5: Grain boundary features used
The following features of the grain boundaries between neighboring grains were used:
- misorientation θ
- length of grain boundary
- direction vector g = (g1, g2) pointing from one grain center to another

Supplementary Note 6: GN training details
For the GN architecture, we used the encode-process-decode machine [5]. The idea of the model is to use the features of nodes (grains) and edges (grain boundaries between neighboring nodes) to predict a node-wise variable (the target is log ρGND/s after loading). First, the model encodes the data, both node and edge features, into a latent space, and then processes the data by passing the latent representations of the features to neighboring nodes. The number of processing steps then decides how 'far' the interactions are considered to apply (e.g. with three processing steps, a node receives the data of all nodes reachable in three steps via the message passing). Finally, after the processing steps, the processed data is decoded from the latent space back to the desired form (i.e. the target).

We implemented the GN with DeepMind's Graph Nets library [5]. For training, we applied the same idea as e.g. in [6], where the number of processing steps is fixed to some value but the GN is encouraged to solve the problem with as few steps as possible. We used five steps, which is arguably many, considering that grains that many jumps apart do not necessarily affect each other. Indeed, Supplementary Fig. 5a shows the training and validation loss during training, and the smallest validation loss is achieved with three processing steps. The other parameter values used for our model were: 32 latent parameters for nodes, 16 latent parameters for edges, and two hidden layers in the encoder, decoder and processor. The learning rate was set to 5 · 10⁻⁶ and the training was conducted with an early-stopping criterion, i.e. according to Supplementary Fig. 5a, the optimal GN was obtained after approximately 5000 epochs, where the validation loss was at its minimum.
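For illustration, the message-passing structure of the encode-process-decode model can be sketched in a framework-agnostic way as follows (plain numpy, with random linear maps standing in for the trained MLPs; the actual implementation uses DeepMind's Graph Nets library, and all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim):
    """Stand-in for a trained MLP: a random linear map + ReLU (illustration only)."""
    W = rng.normal(0, 0.1, (in_dim, out_dim))
    return lambda x: np.maximum(x @ W, 0.0)

def encode_process_decode(nodes, edges, senders, receivers, steps=3,
                          node_latent=32, edge_latent=16):
    # encode node (grain) and edge (grain boundary) features into latent space
    enc_v, enc_e = mlp(nodes.shape[1], node_latent), mlp(edges.shape[1], edge_latent)
    h_v, h_e = enc_v(nodes), enc_e(edges)
    upd_e = mlp(edge_latent + 2 * node_latent, edge_latent)
    upd_v = mlp(node_latent + edge_latent, node_latent)
    dec_v = mlp(node_latent, 1)  # decode to the node-wise target, log(rho_GND/s)
    for _ in range(steps):
        # update each edge from its latent state and its two endpoint nodes
        h_e = upd_e(np.concatenate([h_e, h_v[senders], h_v[receivers]], axis=1))
        # aggregate incoming edge messages per node (sum), then update the nodes
        agg = np.zeros((h_v.shape[0], h_e.shape[1]))
        np.add.at(agg, receivers, h_e)
        h_v = upd_v(np.concatenate([h_v, agg], axis=1))
    return dec_v(h_v)

# toy usage: 5 grains with 14 features, 8 directed boundaries with 3 features
nodes = rng.normal(size=(5, 14))
edges = rng.normal(size=(8, 3))
senders = rng.integers(0, 5, 8)
receivers = rng.integers(0, 5, 8)
pred = encode_process_decode(nodes, edges, senders, receivers)
```

With this structure, increasing `steps` widens the neighborhood a node can draw information from, which is the mechanism referred to above when discussing the number of processing steps.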
To further elaborate on what the GN has learned by the end of the training, Supplementary Figs. 5b-c show the latent representation of the nodes after three processing steps in a space of reduced dimensions (t-SNE). In Supplementary Fig. 5b, the coloring is according to the target value the GN is trying to learn, and there are clearly separated regions of high-ρGND/s and low-ρGND/s grains. Furthermore, in Supplementary Fig. 5c, the coloring is according to twinning / not twinning, and again some separation can be seen in the latent space (i.e. of grains that are more inclined to twinning).
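Such a visualization can be produced with scikit-learn's t-SNE implementation. A minimal sketch, assuming the latent node representations after three processing steps are stored in an array `latent` and the targets in `target` (names illustrative):

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# latent: (n_grains, 32) latent node representations after 3 processing steps
embedding = TSNE(n_components=2, random_state=0).fit_transform(latent)

# color by the regression target, log(rho_GND/s) after loading
plt.scatter(embedding[:, 0], embedding[:, 1], c=target, s=5, cmap="viridis")
plt.colorbar(label=r"log $\rho_{GND}/s$")
plt.show()
```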